System and method for distributed media personalization

ABSTRACT

A system for generating edited video, the system comprises a storage module configured to store a plurality of video tags; a network interface configured to receive an unedited video clip and a video tag selection from a user device and automatically transmit an edited video to a video sharing website; and a transformer module configured to generate an edited video based on the unedited video clip and the video tag selection.

RELATED APPLICATIONS INFORMATION

The application claims the benefit under §119(e) of U.S. Provisional Application Ser. No. 61/357,442 filed Jun. 22, 2010 and entitled “System and Method for Media Personalization,” and also claims the benefit under §119(e) of U.S. Provisional Application Ser. No. 61/357,443 filed Jun. 22, 2010 and entitled “System and Method for Distributed Media Personalization,” and also claims the benefit under §119(e) of U.S. Provisional Application Ser. No. 61/357,447 filed Jun. 22, 2010 and entitled “System and Method for Remote Media Processing Selection and Preview,” all of which are incorporated herein by reference in their entirety as if set forth in full.

FIELD OF THE INVENTION

The embodiments described herein generally relate to the fields of media processing and network communication systems and more specifically to systems and methods for personalizing media using a communication network.

BACKGROUND

With the explosion of social networking, cloud storage and computing, faster network speeds, and smart phones and tablets with video capability, people are capturing and sharing video in greater and greater amounts. Thus, it is not uncommon for mom or dad to capture a video of their child being dropped off at school, participating in an activity, or just running around the house with their smart phone and then immediately email the video to friends and family or post it on a social networking page. Often, however, the quality of these videos is not very good. The image is choppy and bounces around, there is little or no audio, etc. In addition, the video is not very professional looking, i.e., there is no title, introduction, sound track, etc. These are things that, if done well, can make even impromptu videos such as those described above compelling to even an uninterested observer. So while the video is very interesting and meaningful for mom and dad, it may not be as interesting to everyone else.

As a result, there are applications available that allow a user to edit their video and generate much more refined productions in which some of the choppiness is smoothed out, filtering is applied to enhance the video quality, sound effects are applied and synchronized with the images, a theme can be applied, etc. But often these tools require a larger investment of time than the average user is willing to commit. Unfortunately, the conventional resources required to perform such editing do not allow for quick, easy editing that can produce a more interesting and professional video. Further, conventional devices used to capture video or to share video over the internet are often resource constrained. For example, the device may be limited by processing capability, power, or other resources. As a result, the resulting video typically will lack editing or other features that would greatly improve the quality of the shared video.

SUMMARY

Systems and methods for turning video clips captured with a user device into polished final products and for automatically sharing them are disclosed herein.

A system for generating edited video, the system comprises a storage module configured to store a plurality of video tags; a network interface configured to receive an unedited video clip and a video tag selection from a user device and automatically transmit an edited video to a video sharing website; and a transformer module configured to generate an edited video based on the unedited video clip and the video tag selection.

A device for generating edited video, the device comprises a storage module configured to store an unedited video and information related to one or more video tags; a user interface configured to receive a selection of one of the one or more video tags; and a network interface configured to transmit, to a remote server, the unedited video, the selection of one of the one or more video tags, and an indication of a video sharing website where an edited version of the unedited video is to be published.

A system for generating edited video, the system comprises a remote server, the server comprising a server storage module configured to store a plurality of video tags, a server network interface configured to receive an unedited video clip and a video tag selection from a user device and automatically transmit an edited video to a video sharing website, and a transformer module configured to generate an edited video based on the unedited video clip and the video tag selection; and a user device for generating edited video, the user device comprising a user device storage module configured to store an unedited video and information related to one or more of the video tags, a user interface configured to receive a selection of one of the one or more video tags, and a device network interface configured to transmit, to the remote server, the unedited video, the selection of one of the one or more video tags, and an indication of a video sharing website where an edited version of the unedited video is to be published.

BRIEF DESCRIPTION OF THE DRAWINGS

The details of the present invention, both as to its structure and operation, may be gleaned in part by study of the accompanying drawings, in which like reference numerals refer to like parts, and in which:

FIG. 1A is a block diagram of networked computer systems for personalizing and sharing media according to an embodiment;

FIG. 1B is a block diagram of a user device according to an embodiment;

FIG. 2 is a functional block diagram of networked computer systems for personalizing and sharing media according to another embodiment;

FIG. 3 is a block diagram of a media clip file format according to an embodiment;

FIG. 4 is a functional block diagram and flow diagram for networked computer systems for personalizing and sharing media according to another embodiment;

FIGS. 5A-F are illustrations of a user interface for personalizing and sharing media according to an embodiment;

FIG. 6 is a block diagram of a media clip format according to an embodiment; and

FIG. 7 is a flowchart of a method for personalizing and sharing media according to an embodiment.

DETAILED DESCRIPTION

Systems and methods for media personalization and sharing are presented. In one embodiment, the present method and tool or system allows a user to take a video clip that was just shot, e.g., with a mobile device, and turn it into something personalized and polished and route it to desired destinations quickly and easily. For example, in some implementations, the tool or system allows or enables a user to apply a Video Tag, also referred to as a Personalization, to media from any device. Further, the system allows a user to modify and configure a Video Tag from any device. The user can also preview the modified video in real-time. As a result, a user can quickly and easily generate polished video, preview it, and share it with friends or post it to a social networking page.

As used herein, a Video Tag, Video Tag template, Production, or Production template includes a set of instructions to personalize media. In one embodiment, the Video Tag defines a set of effects to apply to a video along with embellishments such as music and title style. The Video Tag defines the process for producing a video clip in a particular style and also includes options for user modification. In general, a Video Tag represents a configuration that can include a sequence of images, video clips, titling, sound effects, transitions, mixing, etc., to wrap that video clip into a complete polished video.

As noted above, in some circumstances, the devices used to capture video or to share video over the internet are resource constrained. For example, a mobile phone or other device may be limited by processing power, bandwidth, power, or other resources when editing and sharing video. In one embodiment, a client application is provided to the user of a resource constrained device that facilitates video personalization without excessive consumption of resources. A corresponding application is provided on, for example, a server, to perform portions of the video customization and sharing. As described below, this advantageously facilitates sophisticated media personalization despite resource constrained user devices.

In other circumstances, a user may use multiple devices to capture video. For example, a user may have a cell phone capable of capturing video as well as another mobile device with similar capture capabilities. A user may have a set of preferences for personalizing video regardless of capture device. In one embodiment, systems and methods are provided for making a user's personalization and sharing settings available across multiple devices. In other circumstances, systems and methods are provided for taking simple video and converting it into a fully polished and personalized production. Various combinations of the embodiments described herein are possible.

Turning now to FIG. 1A, a block diagram of networked computer systems 100 for personalizing and sharing media is shown. The networked computer systems 100 include a user device 10, a network 30, and a server 20. The user device 10 is communicatively coupled to the network 30. The server 20 is also communicatively coupled to the network 30. The user device 10 and the server 20 communicate via the network 30.

In one embodiment, the user device 10 is a machine with the ability to communicate over the network 30. For example, in one embodiment, the device 10 is a personal computer with a network connection such as an Internet connection or a wireless device, such as a mobile telephone or a personal digital assistant, with access to a wireless network. The user device 10 comprises an application module 15. As described in greater detail below, the application module 15 operates in conjunction with the server 20 to accomplish the media personalization and sharing described herein.

It will be understood that conventional devices, such as those described above with respect to device 10, often include capable hardware for video compression so they can squeeze a large amount of video data through the network 30. Such devices are therefore often optimized for efficient transcoding of the video, which often not only makes the process of compressing video, uploading it to a server 20 where it can be processed, recompressed, and downloaded to device 10 faster than processing the video on the device, but also allows more sophisticated processing as described below.

For example, in one embodiment, the application module 15 is configured to access media, such as a video stored on the device 10. The application module 15 is configured to upload this media to the server 20 via the network 30. The application module 15 is also configured to receive input from a user of the device 10 in the form of customization options. The application module 15 transmits this input to the server 20. In some embodiments, the application module 15 receives media from the server 20 that are previews of what the personalized media will look like once it is fully processed. The application module 15 displays these previews to the user of the device 10. The user can provide additional input to the application module 15 based on the previews. The application module 15 then sends the additional input to the server 20 via the network 30. In some embodiments, this process of receiving previews, presenting the previews to the user, and sending additional input to the server 20 continues at the application module 15 until the user is satisfied with the personalized media. After receiving an indication of acceptance, the application module 15 is configured to transmit an indication of the acceptance to the server 20. In some embodiments, the application module 15 also solicits input from the user on how the personalized media should be shared. The application module 15 transmits this input on how the personalized media is to be shared or published, e.g., to Facebook, YouTube, or another internet site. Other examples of the operation of the device 10 and the application module 15 are described in greater detail below.

In one embodiment, the server 20 is a computer system with a network connection such as a server, a personal computer, or other device capable of performing the processes described herein and communicating over a network. The server 20 comprises a transformer module 25. As described in greater detail below, the transformer module 25 operates in conjunction with the application module 15 in order to accomplish the media personalization and sharing described herein.

It will be understood that server 20 is intended to be representative of the resources needed to carry out and implement the systems and methods described. As such, server 20 can comprise multiple servers, routers, processors, databases, storage devices, applications, APIs, programs, user interfaces, etc., as needed or required by a particular implementation. Moreover, it will be understood that many, if not all, of these resources can reside in the cloud.

In one embodiment, the transformer module 25 receives media, such as a video clip, from the user device 10 as well as an indication of one or more customization options. The transformer module 25 processes the media according to the customization options. In some embodiments, the transformer module 25 also generates previews of the personalized media and transmits the previews to the user device 10. These previews can be provided in real-time or near real-time as described below. The transformer module 25 receives additional feedback from the user device 10 in response to the transmitted previews and further processes the media responsive to the feedback. In some embodiments, the transformer module 25 generates and transmits new previews to the user device 10 based on the feedback. In some embodiments, after one or more of these feedback cycles, the transformer module 25 receives an indication of an acceptance of the media personalization. The transformer module 25 then shares the personalized media according to input received from the device 10. Other examples of the operation of the server 20 and the transformer module 25 are described in greater detail below.

In one embodiment, the network 30 comprises a communication network such as the Internet, a local area network, a wide area network, a virtual private network, a telephone network, a cellular network, a direct connection, a combination of these networks, or another type of communication network.

Advantageously, the systems 100 operate in conjunction to allow resource constrained devices to perform sophisticated, user controlled, media personalization and to share the personalized media.

FIG. 1B is a block diagram of a user device 10. In one embodiment, the user device 10 is similar to the user device 10 of FIG. 1A. The user device 10 comprises a processor 31, a network interface 32, a storage 33, a user interface 34, and a media capture device 35. The processor 31 is communicatively coupled to the network interface 32, storage 33, user interface 34, and media capture device 35. In one embodiment, the processor 31 is configured to implement the functionality described herein with respect to the application module 15 and the device 10.

The network interface 32 transmits and receives messages via the network 30. For example, the network interface 32 transmits media, such as videos, to the server 20 via the network 30 and receives previews from the server 20 via the network 30. The storage 33 is a tangible, computer readable medium. It stores, for example, media, such as videos, and instructions for causing the processor 31 to perform the functionality described with respect to the application module 15 and the device 10.

The user interface 34 comprises an interface for presenting information to a user of the device or for receiving input from a user, or both. For example, the user interface 34 may comprise a touch screen for presenting previews to a user and for accepting input regarding personalization options for the media being personalized. Other types of devices that communicate to a user or receive information from a user may also be used. The media capture device 35 comprises a device such as a camera or microphone that captures media. The captured media can be stored in the storage 33 and processed by the processor 31 as described herein. Examples of a user device, in this case a mobile user device, are illustrated in FIGS. 5A-F.

FIG. 2 is a functional block diagram of networked computer systems for personalizing and sharing media. In particular, additional details of the transformer module 25 are shown. As described herein, the transformer module 25 applies a Video Tag to media to generate personalized output. As shown in FIG. 2, in one embodiment, the transformer module 25 includes user account storage 40, preset storage 60, control logic 50, processor 36, storage 37, and network interface 38. The control logic 50 is communicatively coupled to the user account storage 40, preset storage 60, storage 37, processor 36, and network interface 38 and controls the functioning of these other elements of the transformer module 25.

In one embodiment, the user account storage 40 stores account records or information 44 for individual users. These account records can be unique for each user. These account records 44 can include one or more Video Tags 42 associated with a user's account. As noted above, a Video Tag or Video Tag template is a set of instructions to personalize media. In one example, the Video Tag defines a set of effects to apply to a video along with embellishments such as music and title style. Video Tags define the process for producing a video clip in a particular style and also include options for user modification. A user account 44 can have one or more Video Tags 42 associated with it in the user account storage 40. The set of effects for a particular Video Tag can have default values for various settings. Users can modify the default settings for default Video Tags and thereby create modified or customized Video Tags. Advantageously, by providing default settings, users can quickly select and apply a Video Tag. At the same time, by allowing users to customize or modify Video Tags, a high degree of personalization is provided. In general, modified Video Tags are Video Tags that have been changed in any manner from a default state by a user. Preset storage 60 includes a plurality of preset Video Tags 62 having their default values. The account associated Video Tags 42 can include preset Video Tags 62 or modified versions of the preset Video Tags 62 or both.

In addition, each user account record 44 can contain additional information. In one embodiment, this additional information can include records that are used for billing. In one particular example, the account record 44 includes information on the minutes of processing applied to media uploaded by a user associated with the account to facilitate usage based billing. In another embodiment, the account record 44 for a user contains rules for sharing or propagating the personalized media. These rules can include information such as where to place the media, e.g., YouTube, Facebook, etc. In one embodiment, these rules are set based on input received from the user device 10. For example, in one embodiment, the user selects and identifies where the videos are transmitted and posted. The rules can include information for sending media to services. This information can include account IDs, passwords, file format descriptions, or other information. In one embodiment, a setup wizard or other program runs on the user device 10 in order to walk a user through the configuration of different settings and collect user input for such rules. In one embodiment, the types of information collected and stored by the user device 10 and transmitted to the transformer module 25 depend in part on the intended destination for the media, e.g., YouTube or other website. In some embodiments, the account record 44 also contains other information such as nicknames, photos, personalized media, or unedited media.

The storage 37 is a tangible computer readable medium that stores information for use by the other components of transformer module 25. In one example, the storage 37 stores unprocessed media from user devices, partially processed media such as previews, and fully processed media such as personalized media that will be shared. In one embodiment, the storage 37 also stores instructions for causing the transformer module 25 and its elements to perform the functionality described herein with respect to the transformer module 25 and the server 20.

The processor 36 is configured to process received media according to the information in a Video Tag as well as other input received at the transformer module 25 from a user device 10. For example, the processor 36 takes one or more media files, e.g., videos, and other user input, e.g., selection of a Video Tag, received by the transformer module 25 and uses them to create personalized media, e.g., a video file, based in part on the instructions in the selected Video Tag.

The network interface 38 transmits and receives information over a network such as the network 30. For example, the network interface 38 receives unedited media from the user device 10 and transmits personalized media to video sharing sites 80.

As described in more detail below, in one embodiment, a transformer module 25 implemented in a server 20 is used to apply a Video Tag to media to generate personalized output. The transformer module 25 manages a database of users 40 and Video Tags 42 and 62. A remote application module 15 communicates with the transformer module 25 via a network 30 to process a video clip. The remote application module 15 sends the transformer module 25 the video clip (or other media), with instructions for processing (Video Tag choice), and desired destination.

For the purpose of explanation, one embodiment of the communication between the application module 15 and transformer module 25 will now be described. In this example, the application module 15 communicates with the transformer module 25 via HTTPS or any appropriate protocol over the network 30. This communication link can be a wireless or wired connection or both. The application module 15 logs into the user account storage 40 to access the user's private set of Video Tags 42 and account information 44. The application module 15 downloads Video Tag choices from the transformer module 25, drawn from the user account storage 40 (e.g., the Video Tags 42), from the public Video Tag list (e.g., Video Tags 62) provided by preset storage 60, or from both.

Next, via the application module 15, the user chooses media and a Video Tag, and selects some options, including typing in a name for the clip. The application module 15 uploads the media (video clip, bitmaps, sounds, etc.) and user choices to the transformer module 25. Processor 36 uses the selected Video Tag and user selected options to control the creation of a finished video. In one embodiment, a single video is used to create the finished video. In some embodiments, two or more videos can be used in creating the finished video. In one embodiment, creating the finished video comprises applying one or more effects to the video. In general, an effect, or filter, may be an operation that is applied to the frames in a video in order to impart a particular look or feel. For example, a stabilization or smoothing effect may be applied to the frames in the video. In one embodiment, effects may be distinguished from transitions that are designed to alter the way in which a video begins or ends or the way in which one video moves into another video. A sample of effects may be viewed at http://www.newbluefx.com/.

Continuing on, transformer module 25 posts the finished video to user selected video sharing and social networking sites 80. As described above, the video sharing and social networking sites can have different formats and require different information when receiving uploaded videos. The transformer module 25 can use information in the user account 40, e.g., rules for propagating the finished videos, when posting the finished video. In another embodiment, the transformer module 25 can store information about the requirements and interface for uploading video to different sites or services. Additionally, the video can be sent by email, etc., to others. In other embodiments, the user application can collect rules for sharing the video along with the selection of a Video Tag and transmit the rules along with the Video Tag selection.

Production Files

A Production file is an XML file, or other format file, that carries information used by the processor 36 to assemble a video project. In one embodiment, the Production is an intermediate file representing a Video Tag with all of the Options and Variables replaced. Options and Variables are described in greater detail below. The Production can be generated by a parser operating on a Video Tag. In one embodiment, the Production can be generated after a first pass of a parser through a Video Tag. In general, a Production file is a specific plan for creating a video that comprises instructions for the processor on how to assemble the video.

In one embodiment, a Production includes one or more video and audio tracks. Within each track, in time order, are segments, which include an input source, a trim in point, a start time relative to the end of the previous segment, a duration, one or more Plugin effects, and one or more transitions. The input source can be a media file reference such as a sound effect, a video clip, a photo, or other reference. The input source can also be a software media generator such as Titler, Background surface, or other generator. The duration can identify a specific length or be set by the length of the media. Plugin effects are referenced by name and include settings. Transitions are referenced by name and include transitions into the segment with settings and transitions out to the next segment with settings.
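For illustration only, the track and segment layout described above can be sketched as plain data records. The class names, field names, and the helper `absolute_starts` are hypothetical and do not appear in this disclosure; the sketch also assumes that a segment with no explicit duration plays the remainder of its media after the trim in point.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Plugin:
    """A named effect or transition plus its settings (hypothetical record)."""
    name: str
    settings: Dict[str, str] = field(default_factory=dict)


@dataclass
class Segment:
    source: str                       # media file reference or generator name
    trim_in: float = 0.0              # seconds skipped at the head of the source
    start_offset: float = 0.0         # seconds relative to end of previous segment
    duration: Optional[float] = None  # None means "set by the length of the media"
    effects: List[Plugin] = field(default_factory=list)
    transition_in: Optional[Plugin] = None
    transition_out: Optional[Plugin] = None


@dataclass
class Track:
    kind: str                         # "video" or "audio"
    segments: List[Segment] = field(default_factory=list)


def absolute_starts(track: Track, media_lengths: Dict[str, float]) -> List[float]:
    """Resolve each segment's relative start into an absolute start time."""
    starts = []
    previous_end = 0.0
    for seg in track.segments:
        start = previous_end + seg.start_offset
        if seg.duration is not None:
            length = seg.duration
        else:
            # Assumption: play what remains of the media after the trim in point.
            length = media_lengths[seg.source] - seg.trim_in
        starts.append(start)
        previous_end = start + length
    return starts


track = Track("video", [
    Segment("intro.png", duration=3.0),
    Segment("Titler", start_offset=-1.0, duration=4.0),  # overlaps the intro by 1 s
    Segment("clip.mp4", trim_in=2.0),                    # length taken from the media
])
print(absolute_starts(track, {"clip.mp4": 12.0}))
```

A negative `start_offset` is one possible way to let a segment overlap its predecessor so that a transition has frames from both segments to work with.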

Video Tag

In one embodiment, the Video Tags or Video Tag templates 42 and 62 are XML (or other format) files that are similar to Production files. However, in a Video Tag, one or more of the strings within it have been replaced by unique tokens. These tokens are placeholders for option strings. Video Tags also include one or more Option Sets. In one embodiment, an Option Set includes a category name, e.g., “Title Style”, and a series of Option presets. In one embodiment, the Option presets include a set of tokens that match tokens in the Production and, for each token, the string (or data) to replace it with. In some embodiments, this string can be a simple preset name or a block of XML or other information. For example, the replacement text can be just a simple string. In another embodiment, a complex block of text can be used. In some embodiments, an XML attribute can be used. An attribute can be just one parameter. An attribute can also be a set of parameters. For explanation, as described below, an effect may be applied to a Segment. A string representing the effect can comprise a block of text that includes nested attributes for each parameter. Video Tags also include external variable definitions. External variable definitions include a token identifier (ID) and a name, e.g., “Title.”

FIG. 3 shows an exemplary Video Tag template 62. In this exemplary Video Tag template 62, there are two video tracks (Video 1 and Video 2) and one audio track (Audio 1). The top video track has Intro Slide 63, Title 64, and Video Clip Segments 65.

The Intro Slide segment 63 media is a bitmap picture with some effects applied to it. It is preceded and followed by Transition sections 66. One of the Option Sets 67 contains information used by the processor 36 to control the transitions in and out of the segment. This means that choosing different Options chooses different preset transition choices.

The title segment 64 is generated by a Titling plugin. Underlying the title segment 64 is a second track 68 which generates the background. A Variable 69 contains the information for the actual name in the Title segment 64. This configuration allows the application module 15 to set the title with a text string. Additionally, an external Option Set 71 contains information used by the processor 36 to set some effects on the clip to achieve a particular look. This is an external set because this can be used to manipulate the look in different Video Tags. Because it is external, it looks to the Video Tag itself as if it is just another external Variable.

The Video Clip segment is a video file. This is the file that the whole Video Tag was designed to turn into a work of art. For example, effects (e.g., FX 72) can be applied by the processor 36 to the contents of a video clip, such that the video clip is modified by the effects. Since the actual file choice is determined by the user, the Variable 73 provides a mechanism to set the name of the file externally. In one example, the transformer module 25 uses Video Tag 62 of FIG. 3, one or more media clips, and optionally some commands from a user and generates a finished video.

For example, a particular production may comprise multiple videos and other media that are overlaid or mixed and displayed or rendered at the same time. These can be blended through the process of “compositing”. The processing engine can be a powerful video compositor that blends multiple overlaying video tracks. Many of these video tracks can incorporate an alpha channel that defines transparency and opacity. Where the track is transparent, the underlying image can be seen through it. The underlying image can be another video track, or a photo, etc. The titles can work the same way, in that each title can be generated by software that creates an image that has letters with alpha transparency around them, so the letters can overlay photographs, videos, etc.
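As a minimal sketch of how an alpha channel lets an underlying image show through, the following uses the conventional "over" blend, in which each output channel is top * alpha + bottom * (1 - alpha). The disclosure does not specify the blend actually used by the processing engine, so this formula, the function names, and the frame layout (rows of RGB tuples with a separate alpha mask in the range 0.0 to 1.0) are assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def blend_pixel(top: Pixel, alpha: float, bottom: Pixel) -> Pixel:
    """Blend one pixel: alpha 1.0 shows only the top, 0.0 only the bottom."""
    return tuple(round(t * alpha + b * (1.0 - alpha)) for t, b in zip(top, bottom))


def composite(top: List[List[Pixel]],
              alpha: List[List[float]],
              bottom: List[List[Pixel]]) -> List[List[Pixel]]:
    """Composite a top frame over a bottom frame using a per-pixel alpha mask."""
    return [
        [blend_pixel(t, a, b) for t, a, b in zip(top_row, alpha_row, bottom_row)]
        for top_row, alpha_row, bottom_row in zip(top, alpha, bottom)
    ]


# A 1x2 title frame: the first pixel is an opaque white letter,
# the second is fully transparent so the underlying video shows through.
title = [[(255, 255, 255), (0, 0, 0)]]
mask = [[1.0, 0.0]]
video = [[(10, 20, 30), (40, 50, 60)]]
print(composite(title, mask, video))
```

With more than two tracks, the same operation could be applied repeatedly from the bottom track upward.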

In order to illustrate the functionality of the systems described herein, one example of the operation of the networked systems 100 is described. Via the user interface on the user device, the user chooses a Video Tag from a set of available Video Tags. As described above with respect to FIG. 2, the Video Tags can be selected from the user's private Video Tags 42 or a public Video Tag list 62. In other embodiments, the Video Tag may come from any source. For example, the Video Tag may be generated by the user, selected from pre-existing lists, or shared with other users.

The Video Tag provides Options in several categories. The user chooses an option by name or icon via the user interface. The Video Tag also includes named Variables. These represent data that is input directly into the project by the user via the user interface. Two examples are the title text and the file name of the Video Clip. In some embodiments, there are also Options stored in an external Options file, which are used to set one or more variables in the Video Tag. In one example, an external Options file contains an Option Set that sets the Background color and Title font. By storing the Options as a separate file, the file and its Options can be used for multiple Video Tags. In one embodiment, the external Options files are stored on the server 20. In one example, the user selects an Option via the user interface. The selected Option is mapped to the destination Video Tag as one or more Variables. In another example, a user selects an External Option Set such as the External Option Set 75 of FIG. 3 via the user interface. The External Option Set sets the “Look” of the video by providing a set of FX presets to choose from.

Once input corresponding to the Video Tag, Options, and Variables has been collected by the application module, the input is transmitted to the transformer module 25 of the server 20. The processor 36 of the transformer module then parses the input. Parsing creates a Project file dynamically by substituting Options and Variables for all tokens. In one embodiment, the parsing is done as a search and replace operation by the processor. For example, for each token in the Video Tag, the equivalent token in a Variable or Option is located and the string or data for the token is substituted with the Variable or Option information.
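The search and replace operation can be sketched as follows. The `%%NAME%%` token syntax, the XML fragment, and the function names are hypothetical, since the disclosure does not fix a token format; the sketch simply treats the Video Tag as text and swaps each token for its replacement string.

```python
import re
from typing import Dict, List

# Assumption: tokens look like %%NAME%%. The actual token format is not specified.
TOKEN_PATTERN = re.compile(r"%%[A-Z_]+%%")


def substitute_tokens(video_tag: str, replacements: Dict[str, str]) -> str:
    """Replace every occurrence of each token with its string or data."""
    result = video_tag
    for token, value in replacements.items():
        result = result.replace(token, value)
    return result


def unresolved_tokens(text: str) -> List[str]:
    """List tokens still present, i.e. ones no Option or Variable supplied."""
    return TOKEN_PATTERN.findall(text)


video_tag = '<segment source="%%CLIP%%"><title text="%%TITLE%%" font="%%FONT%%"/></segment>'
project = substitute_tokens(video_tag, {
    "%%CLIP%%": "first_day.mp4",      # Variable: file chosen by the user
    "%%TITLE%%": "First Day",         # Variable: title text typed by the user
})
print(project)
print(unresolved_tokens(project))
```

Checking for leftover tokens, as `unresolved_tokens` does, is one way a parser could detect that a required Option or Variable was never supplied.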

In one embodiment, the processor applies Variables after Options. Thisallows a Variable to be embedded with an Option. For example, thisallows a set of choices for a media file (the Options) including oneoption to provide your own media file, which in turn is managed with aVariable. As described above, in one embodiment, a variable is userdefined. Variables can be combined with options. For example, instead ofchoosing a predetermined string, a user can also provide one. In anotherexample, instead of selecting one of a plurality of media choices, e.g.,bitmaps or sound files, the user may have the choice to provide the fileas well. Similarly, a choice of a combination of Title font and overlaideffect is an option. A choice to set the bitmap for the title from apreset list is an Option while entering a new file is a Variable.Setting the video clip to be processed is a Variable.
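The search and replace parsing described above, with Options applied before Variables, can be sketched as follows. This is a minimal illustration only: the `{{name}}` token syntax, the function names, and the sample Video Tag string are assumptions made for the example and are not defined by the embodiments.

```python
import re

# Hypothetical token syntax: {{name}}. The actual token format is not specified.
TOKEN = re.compile(r"\{\{(\w+)\}\}")

def substitute(text, values):
    """Replace each {{token}} that has an entry in `values`; leave other tokens untouched."""
    return TOKEN.sub(lambda m: values.get(m.group(1), m.group(0)), text)

def build_project(video_tag, options, variables):
    """Apply Options first, then Variables, so an Option may itself carry a Variable token."""
    after_options = substitute(video_tag, options)
    return substitute(after_options, variables)

# Illustrative Video Tag with four tokens.
video_tag = "title={{title}} font={{title_font}} logo={{logo}} clip={{clip}}"

# The "logo" Option chosen here is the "provide your own file" choice,
# which defers to a Variable supplied by the user.
options = {"title_font": "Serif Bold", "logo": "{{user_logo}}"}
variables = {"title": "First Day", "clip": "school.mp4", "user_logo": "family.png"}

project = build_project(video_tag, options, variables)
print(project)
```

Because the Variables pass runs second, the `{{user_logo}}` token introduced by the Option is still resolved, which is the behavior that applying Variables after Options is intended to allow.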

After parsing the user input and creating the Project file, the Processor 36 can create the personalized media. In one example, the Processor 36 takes the Project file and converts it into a time stamped sequential list of media Segments. In some embodiments, the parsing is performed concurrently with the processing described here. Thus, the step of creating the Project file at the processor 36 is optional in some embodiments.

Each Segment represents a portion of media to use. The segments includeone or more of a track, a start time, a duration, a starting offset, asource indication (e.g., a file or Generator plugin with parameters), atransition in (including duration, an indication of the transitionplugin to use, and a parameter preset to use), a transition out(including duration, an indication of the transition plugin to use, aparameter preset to use, and a destination), and one or more effects(including an effect plugin to use and a parameter preset to use).

In one embodiment, in the case of a video file, the Processor 36 createsone frame of the resulting personalized media at a time. While oneembodiment of this process is described, it will be appreciated thatother processes may be used to achieve similar outputs. For each frameor time stamp of the personalized media, the processor 36 performs thefollowing steps.

First, the processor identifies each segment that is active at theparticular time stamp or frame. Second, the active segments are sortedby track such that highest numbered track is handled first. This setsthe order for compositing. Third, the processor initializes a blankmaster frame buffer and a blank master audio buffer. Fourth, for eachactive video segment, the processor obtains the media for the frame,e.g., video, audio, or image uploaded from the application module, andplaces the media in a frame buffer. The processor then applies one ormore effects from the effect list to the media, applies any transitionin or out that overlaps with the frame, and Alpha blends the buffer ontothe master frame buffer. Fifth, for each active audio segment, theprocessor obtains the media for the frame and places the media in anaudio buffer. The processor then applies one or more effects from theeffect list to the media, applies any transition in or out that overlapswith the frame, and adds the audio to the master audio buffer. Sixth,the processor writes the master video and audio frames to an output filestream.
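The per-frame video steps above can be sketched in simplified form. In this illustration each frame is reduced to a single brightness value, audio, effects, and transitions are omitted, and the `Segment` fields and function names are assumptions chosen for the example rather than the format used by the processor 36.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    track: int
    start: float      # seconds
    duration: float   # seconds
    value: float      # stand-in for the pixel data of the source media
    alpha: float = 1.0

    def active_at(self, t):
        """True if this segment covers time stamp t."""
        return self.start <= t < self.start + self.duration

def composite_frame(segments, t):
    """Return the master frame (one value here) for time stamp t."""
    # Step 1: identify the segments active at this time stamp.
    active = [s for s in segments if s.active_at(t)]
    # Step 2: sort by track so the highest numbered track is handled first.
    active.sort(key=lambda s: s.track, reverse=True)
    # Step 3: initialize a blank master frame buffer.
    master = 0.0
    # Step 4: fetch each segment's media and alpha blend it onto the master.
    for seg in active:
        frame = seg.value  # effects and transitions would be applied here
        master = seg.alpha * frame + (1.0 - seg.alpha) * master
    return master

segments = [
    Segment(track=2, start=0.0, duration=10.0, value=0.2),
    Segment(track=1, start=2.0, duration=3.0, value=1.0, alpha=0.5),
]
for t in (1.0, 3.0, 20.0):
    print(t, composite_frame(segments, t))
```

A full implementation would carry a second loop for audio segments that sums into a master audio buffer, and would write both master buffers to the output stream after each frame.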

Various optimizations of this process are possible. For example, where the frames are processed in order, it is beneficial to retain the current effects and inputs, as the processor may reuse all or part of the effects for proximate frames. In order to implement some effects, the processor may need time-based access to the source media. For example, in some embodiments, stabilization requires that the processor have access to a range of source frames in order to calculate motion vectors. To support this, in one embodiment, the transformer module stores the input stream in storage, such as a FIFO buffer, which provides random access to any individual frame within the FIFO. Thus, the processor can access frames directly from the FIFO for any effect which needs this access.
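A FIFO that also allows random access to the frames it currently holds might look like the following sketch. The class name, the fixed capacity, and the `window` helper are assumptions for illustration; the embodiments do not specify how the buffer is implemented.

```python
from collections import deque

class FrameFifo:
    """Fixed-capacity FIFO of frames that supports lookup by frame number."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)
        self._first_index = 0  # frame number of the oldest frame still held

    def push(self, frame):
        if len(self._frames) == self._frames.maxlen:
            self._first_index += 1  # the oldest frame is about to be evicted
        self._frames.append(frame)

    def get(self, frame_number):
        """Random access to any frame still inside the FIFO."""
        offset = frame_number - self._first_index
        if not 0 <= offset < len(self._frames):
            raise IndexError(f"frame {frame_number} is not in the FIFO")
        return self._frames[offset]

    def window(self, center, radius):
        """Frames around `center`, clipped to what the FIFO holds (e.g., for motion analysis)."""
        lo = max(center - radius, self._first_index)
        hi = min(center + radius, self._first_index + len(self._frames) - 1)
        return [self.get(n) for n in range(lo, hi + 1)]

fifo = FrameFifo(capacity=4)
for n in range(6):
    fifo.push(f"f{n}")       # frames f0 and f1 are evicted along the way
print(fifo.window(center=4, radius=1))
```

An effect such as stabilization could call `window` to obtain the neighboring source frames it needs without re-reading the input stream.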

In one embodiment, as the file is written out to the destination file, it is immediately queued for transfer to the destination location. Because the file is written sequentially, this file transfer can start immediately, before the entire file has been processed.

As described above, processing video effects on many devices can be prohibitively expensive in terms of time and resource consumption. For example, excessive CPU usage results in high power consumption, running down the battery. Also, the time delay while waiting for results keeps the device unavailable for other use. In addition, even without constrained computational resources, configuration of video processing can be problematic. Setting up effects processing involves complicated steps. Further, it can be very slow to develop processing tools that work on a diverse range of devices. For example, complex video effects applications that run natively on the device require rewrites for every implementation. Further, implementing new effects and effect presets to be performed on user devices would require updating and downloading significant amounts of data. Also, allowing native code to run inside a browser as a plugin or as an executable file (e.g., exe) on the device presents a risky choice to the user. For these reasons and the reasons described above, it is desirable to have a tool or system that allows a user to choose a video, choose and preview options, and send it off to be processed remotely and delivered quickly and easily. Additionally, it is desirable to have the impact on phone usability minimized. Certain embodiments relate to methods and systems for media processing selection and preview.

As discussed herein, embodiments of the invention make it easy with avideo enabled device to quickly choose a video and arrange to have itprocessed, personalized, and uploaded to video sites with a few quicksteps. For example, a user can shoot a video clip, assign the clip aname, quickly choose processing and personalization options, and sendthe clip off to be processed remotely. Advantageously, embodimentsdisclosed herein facilitate cost effective development of a tool thatworks to facilitate media personalization on a wide range of devices.Further, the tool has a low impact on the performance of the user devicewhile providing a hassle free experience for the user. In addition,modifications adding new effects can be implemented on the server makingthe update process transparent and simple for users.

For purposes of explanation, the functionality of the application module15 and its interface with a user and with the server 20 will bedescribed in greater detail below. As described above, the Applicationmodule 15 can run on a wide range of devices (e.g., a mobile phone orweb browser application). In one embodiment, in order to facilitatemedia personalization the Application module 15 first presents the userwith a set of available videos to choose from. Next, the applicationmodule 15 allows the user to select one or more options for editing thevideo. These options can include trim information, e.g., start and endpoints within a video. The options can also include a video tag. Forexample, the video tag may include a logo and the application module mayallow the user to modify logo parameters such as the image used for thelogo, the type style of the logo, transitions into and out of the logoclip, as well as the title style and animation. The options selected bythe user via the application module 15 can also include one or moreeffects to apply to the media. After the user makes the selections, theapplication module sends the video and selections to the transformermodule 25 over the network 30 to create the personalized video.

In order to make the process visually stimulating and easy to use, in one embodiment, the application module 15 provides visual feedback via the user interface of the user device during the process of collecting user input and, in some embodiments, as the video is being processed. The Application module receives sets of options that can be selected by the user as well as previews or examples of how the various options look and sound when implemented with the uploaded media.

FIG. 4 shows a block diagram of components and an exemplary flow of information in a system 200 in accordance with an embodiment of the present invention. It should be appreciated that FIG. 4 represents a high-level block diagram of the components used and the flow of information between the components. Such flow of information is presented here. For example, the application module 15 and server 20 may communicate as shown.

In one example, the application module 15 uploads 41 media clips to the server 20. For example, as shown, the application module 15 uploads 41a a video clip to server 20 and uploads 41b a thumbnail image of the video clip to server 20. In one embodiment, the user may initiate the upload process. However, in some embodiments, the application may extract the thumbnail without user direction and may schedule the uploading of video and thumbnails without further user input.

The application module 15 sets variables 43a and options for the Video Tag 43b based on user input. As described above, the Video Tag can be thought of as a template. The Video Tag has information Variables and Option choices that can be selected by a user. As described herein, the Video Tag may be processed and turned into a Production, which is a definition of how to assemble a final video clip.

The application module 15 requests a preview thumbnail and/or video clip. The server 20 creates preview thumbnails and/or video previews, which it streams 42a and 42b back to the application module 15. In general, a thumbnail may be a low resolution version of a still frame or video clip. The requested thumbnail allows the user to see what the processing will look like with the added effect without the CPU and time overhead of processing in full resolution. In one embodiment, a single frame is used as a thumbnail to show what a particular “look” will look like. In other cases, a thumbnail comprises a video clip to illustrate the look over time, e.g., to preview a transition choice.

In one embodiment, server 20 also downloads new option choices, anoption set, related to the new preview media and provides the new optionchoices 44 to the application module 15. The option sets can be providedas separate files or embedded in Video Tag files.

The application module 15 integrates options and preview media into the user interface (“UI”) and displays 45a, 45b the previews and choices to the user. The user makes a choice 46 via the user interface of the application. The application module 15 sends the choice to the server 20 (e.g., sets variables and options for the Video Tag and requests new previews). This cycle of previews and additional inputs can continue until the user is happy with the final personalized media product.

In one implementation, the server includes a transformer module. Thistransformer module takes a Video Tag file and feeds media and parametersinto it, to then generate output either in the form of image thumbnailsor preview video clips. It should be appreciated that the present systemand method are not limited to Video and photos for input or output.Other media types, such as sound files apply equally well.

FIGS. 5A-F represent the graphical user interface of the applicationmodule 15 operating on a mobile phone having a touch screen interface.It should be appreciated that other devices (e.g., smartphone, PDA,etc.) may alternatively be used. Additionally, it should be appreciatedthat the application module 15 may be used in accordance with otherimplementations, for example a browser application written in Flash.

FIG. 5A illustrates a user interface for choosing a video to process and starting preparation for processing. As shown, in one embodiment, application module 15 displays thumbnails of all available videos. Typically, there is an application programming interface (“API”) to handle this, even if it is the file browser. The user selects a clip. Application module 15 starts uploading the clip to the server in the background. If the user switches video choices, application module 15 cancels the first stream and replaces it with the second stream.

FIG. 5B illustrates a user interface for setting the trim in a video, e.g., start and end points in the video clip being uploaded. In one embodiment, the application module creates a widget to select start and end points in the clip. In general, the widget may be implemented according to a number of UI APIs. In general, a visual control may be supplied in order to allow a user to select start and end points for a clip. In one embodiment, to facilitate selecting trim points, the application module uses an API to extract and display thumbnails. The application overlays trim knobs that can be manipulated by the user to set in and out points. The application calls a playback API to preview between in and out points. Once selected, the application module 15 stores the values and uploads them to the server.

FIG. 5C illustrates a user interface for giving a text title to the clip. This function of the application also allows the user to set the name of the video for the title sequence. In one embodiment, Application module 15 presents a text entry widget, using a preferred mechanism of the device. Application module 15 stores the title text and uploads it to the server.

FIG. 5D illustrates a user interface for selection of a video tag for the clip. In one embodiment, Application module 15 presents the user with a palette of visual icons, representing the looks of the different Video Tags, and the user chooses one. In one embodiment, the server 20 streams down to the device 10 a palette of thumbnails, along with meta information such as the name of the associated Video Tag. In some embodiments, this palette is downloaded or updated in the background when the application module starts running or periodically while running. The user chooses a Video Tag. Application module 15 stores the choice. Application module 15 sends the choice to the server 20.

FIG. 5E illustrates a user interface for modifying a Video Tag. In one embodiment, the application module allows the user to modify one or more parameters of the Video Tag, for example Title style. In one implementation, this option data comes from the Options portion of the Video Tag definition. In one embodiment, application module 15 requests an Options table (or similar list of choices) for each editable parameter from the server. Server 20 returns a list of choices to the application. In one implementation, the server 20 extracts the Options table from the Video Tag file. Application module 15 presents the choices to the user. The user makes a choice. Application module 15 sends the selection data back to the server 20. In one implementation, this data takes the form of a token and a string assigned to it.

In one embodiment, after receiving the selection data, the server 20 immediately starts creating a preview clip. Generally, this is a small format preview file that can be generated and streamed in real time. While still processing, the Server 20 begins a download process to transfer the preview to the application module 15 for immediate viewing.

The application module saves the modified Video Tag design. In oneembodiment, the user can assign the Video Tag a new name. Theapplication module 15 uploads the final changes to the Video Tag, alongwith the new name and any attached media, e.g., sound effects or photos.Server 20 merges the uploaded changes into the original Video Tagdefinition and adds it to the user's database so that the user can reusethe video tag later.

FIG. 5F illustrates a user interface for selecting a Look for personalized media. Each Look is a combination of one or more effects to be applied to the video. In one embodiment, as described, when a video is first selected, the Application module 15 sends a thumbnail to the server. Server 20 maintains a list of available looks, their names, and the effect configurations to implement them. In one implementation, this is an Option table and a Video Tag template that takes a thumbnail image and Look option as inputs. For each look, server 20 renders the thumbnail to generate a preview thumbnail. Server 20 downloads to application module 15 the set of preview thumbnails. Server 20 downloads to application module 15 the set of names for the looks.

In one embodiment, Application module 15 creates a menu with the preview set of thumbnails. The User chooses a look from the menu. Application module 15 stores the choice and uploads it to the server. It is also possible to modify a look. For example, a user can change the parameters of the effect applied to the video to create a particular look.

Each effect can have a set of named preset configurations. In general apreset configuration is a particular configuration for an effect'sparameters. In one embodiment, each effect has a set of parameters thatcan be manipulated to change the behavior of the effect. For example, aneffect that creates a glow might have a parameter to control thebrightness of the glow and another parameter to control the color. Insome embodiments, there are between 4 and 10 parameters per effect. Eacheffect may be provided a set of these presets, each with a name.

In one embodiment, to facilitate modifying the looks, Application module 15 requests preset thumbnails from server 20 for the look. Server 20 runs each preset effect over the thumbnails and downloads the results to the application module 15. The User chooses a preset and modifications. Application module 15 stores the choice and uploads it to the server 20. The User can also assign a modified Look a new name via the application 15. Application module 15 uploads the final changes to the Look, along with the name. Server 20 merges the uploaded changes into the original Look definition and adds it to the user's database.

After all modifications and user input, the user can indicate a finalacceptance of the editing options via the application 15. Thisacceptance is stored by the application and transmitted to the server20, telling it to process the video. Once the acceptance is indicated,if the video has not been uploaded yet, the application also startssending the video stream to the server 20. In addition, if notpreviously sent, the application uploads the file name and destination,trim points, video tag selection (this may be encapsulated in a MyVideoTag chunk, e.g., text in XML that chooses a particular Video Tag withOptions and Variables), look selection (may be encapsulated in a MyLookchunk), and information on where to publish the final product (may beencapsulated in a MyUploads chunk).

In some cases, it is desirable to start processing the video streamshortly after the Application module 15 starts transmitting the video tothe Server 20. This enables immediate creation of preview clips or finalrenderings even if the file is not completely uploaded. However, in someembodiments, the server 20 cannot process the stream unless it has thenecessary header information. In certain files, such as some MP4 files,this header information may be placed in a location other than the headof the file (e.g., even at the tail).

FIG. 6 is a block diagram of a media clip format and file stream. Asshown, video file 605 includes a stream header or header chunk 607located at the end of the file. In order to facilitate quicker previewplayback, in some embodiments, the application module is configured toextract the header chunk 607 from the video file and to send the headerchunk 607 to the server at the beginning of the file stream 609. Asshown in FIG. 9, in some implementations, prior to streaming the file,the application module 15 parses the file and seeks to the header chunk.

Other embodiments of the systems and methods described above provideenhancements for the process of previewing video at the user device. Insome embodiments, a motion stabilization effect is performed effectivelywith two passes. In one example, the first pass of the two is to performmotion analysis. This first pass can be far more CPU intensive because,in some embodiments, the processor 36 analyzes adjacent frames tocalculate motion vectors (translation, scale, and rotation) between theframes. The second pass is a playback/render pass where the processor 36uses this information from the first pass to shift the images so thevideo is smooth. In one embodiment, rather than perform both passesmultiple times, the processor is configured to perform the motionanalysis once soon after the video is uploaded. The results of themotion analysis can then be used multiple times by the processor 36 forplayback render passes associated with different previews and with thefinal video product. By doing the complex first pass once per video,computing resources at the server are conserved and latency ingenerating previews is reduced.

In another embodiment, the processor 36 is configured to use different resolutions for previews and final products. In particular, in one embodiment, the processor 36 is configured to generate previews that have a lower resolution than the final product. This shortens the download time of the previews, which can then appear to be real time to a user. In addition, by generating lower resolution previews, the server 20 conserves computing resources. In one embodiment, the processor 36 is configured to receive an uploaded video clip and to generate an intermediate clip having a lower resolution based on the uploaded video. This intermediate, low resolution clip is then used by the processor 36 for generating previews. Once all modifications have been made based on the previews, the processor applies the final selections of the user to the original, full resolution video to generate the final output.

The parameters that can be manipulated for lowering the resolution of the preview can include, depending on the embodiment, frame width and height, compression bit rate, and the actual frames per second. For example, if the normal frame rate is 30 fps and the preview is generated at 15 fps, then any effects only need to be applied half as frequently, cutting processing in half.
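The savings can be worked through with a short calculation. The 60 second duration and the 1280x720 and 320x180 frame sizes below are assumed values chosen only to make the arithmetic concrete; only the 30 fps and 15 fps rates come from the example above.

```python
def frames_to_process(duration_s, fps):
    """Number of frames an effect must be applied to."""
    return round(duration_s * fps)

def pixel_work(width, height, duration_s, fps):
    """Rough measure of per-pixel effect work: pixels per frame times frame count."""
    return width * height * frames_to_process(duration_s, fps)

# Frame rate alone: 30 fps final versus 15 fps preview for a 60 second clip.
full_frames = frames_to_process(60, 30)
preview_frames = frames_to_process(60, 15)
frame_ratio = preview_frames / full_frames

# Frame rate combined with an assumed reduction in frame size.
full_work = pixel_work(1280, 720, 60, 30)
preview_work = pixel_work(320, 180, 60, 15)
work_ratio = preview_work / full_work

print(full_frames, preview_frames, frame_ratio)
print(work_ratio)
```

Halving the frame rate halves the number of frames processed, and under these assumed sizes the additional reduction in width and height lowers the per-pixel work much further.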

In another embodiment, it is desirable to begin generating previewsbefore the entire video clip has been uploaded. To accomplish this, theprocessor is configured to start a transcode process as soon as the filestarts uploading from the user device 10 to the server 20. In oneembodiment, this transcode operation is requested by the application 15.Along with the request, the application provides the server withinformation regarding the format for the transcode or other information.

In some embodiments, these optimizations can be combined. For example, in one embodiment, the processor 36 is configured to initiate a transcode as soon as a video clip begins uploading from a user device. In some embodiments, this transcode is one to many in that it renders out multiple streams simultaneously, each stream having a different file format with different data. For example, one output can be for motion analysis while another output can be used for low resolution preview generation. In one embodiment, this transcode process runs synchronously with the uploaded video, so as soon as a new block comes in, the output streams of data are prepared. In this embodiment, the streams can all be read before the file is closed. This makes it possible for the processor to provide the user device with a preview video render before the upload of the video is completed. In this situation, the preview render will not play all the way through the end of the clip, but it can work with everything that has been uploaded and processed up to the time the preview is provided.

In another embodiment, the server 20 and transformer module 25 implement a scheduling process for processing and personalizing media. Advantageously, the scheduling process ensures that the user experience is optimal, even when there is heavy loading on the server. Where multiple servers implement the functionality described with respect to the server 20, this also allows optimization of server resource allocation so that excess resources are not wasted and so that the servers can respond to spikes in activity without the significant delay involved in waiting for new servers to spin up. In one embodiment, the transformer module implements a scheduling queue combined with priority assignments to ensure that higher priority renders can override lower priority work.

In one embodiment, every render request from a user device is assigned a time stamp and a processing priority by the server transformer module. The transformer module places render requests in a queue. The transformer module organizes the queue first by priority and then by time stamp. Where multiple servers are used, each cloud instance allows a fixed number of processing requests simultaneously. In one embodiment, the fixed number is a function of how many cores the instance has. For example, a server with 8 processor cores might support 16 concurrent processes. This ratio can be adjusted to tune for optimal performance.

In order to implement the scheduling process, the transformer moduleassigns different priority levels to different types of tasks. In oneembodiment, communications between the user device 10 and server aregiven the highest priority. The next highest priority, or high, is givento creation of a video clip for immediate playback, e.g., where a userpressed the play button on the interface for the application. The nexthighest priority, or medium priority, is given to the initial uploadtranscode process that generates the intermediate files for playback andmotion analysis. The lowest priority, or low priority, is given to finalrendering for publication and sharing.
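A queue ordered first by priority and then by time stamp can be sketched with a binary heap. The class and member names below are assumptions for illustration, a monotonically increasing counter stands in for the time stamp, and the default of two processes per core simply mirrors the 8 core, 16 process example.

```python
import heapq
import itertools
from enum import IntEnum

class Priority(IntEnum):
    """Lower value means the request is served earlier."""
    COMMUNICATION = 0      # highest: user device to server communications
    PLAYBACK = 1           # high: clip for immediate playback
    UPLOAD_TRANSCODE = 2   # medium: initial upload transcode
    FINAL_RENDER = 3       # low: final rendering for publication

class RenderQueue:
    def __init__(self, cores, processes_per_core=2):
        # Fixed concurrency limit derived from the core count (tunable ratio).
        self.max_concurrent = cores * processes_per_core
        self._heap = []
        self._clock = itertools.count()  # stand-in for request time stamps

    def submit(self, priority, job):
        # Tuples sort by priority first, then by time stamp.
        heapq.heappush(self._heap, (int(priority), next(self._clock), job))

    def next_job(self):
        _, _, job = heapq.heappop(self._heap)
        return job

    def __len__(self):
        return len(self._heap)

queue = RenderQueue(cores=8)
queue.submit(Priority.FINAL_RENDER, "publish clip")
queue.submit(Priority.UPLOAD_TRANSCODE, "transcode upload")
queue.submit(Priority.PLAYBACK, "preview A")
queue.submit(Priority.PLAYBACK, "preview B")

order = [queue.next_job() for _ in range(len(queue))]
print(queue.max_concurrent, order)
```

In this sketch the two playback previews are served before the transcode and the final render even though they were submitted later, and requests of equal priority keep their arrival order.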

By assigning priorities in this manner, the transformer module ensuresthat the real time behavior that is necessary for a favorable userexperience continues even when the servers are maxed out and waiting fornew servers to come on line. During these peak periods, the finalrenders end up running in the background or queued for later until thelog jam is over.

In another embodiment, the server 20 also performs load balancingbetween a plurality of servers that implement the functionalitydescribed with respect to the transformer module. The server 20 performsload balancing by assigning application module sessions to differentservers. When a new application module session starts, the server 20selects a server for the session based on one or more of CPU usage andprocessing queues for various servers. The selected server is assignedthe session. In some embodiments, the server 20 also requests a newserver instance when it sees processing activity over a threshold. Insome embodiments, this threshold is based on CPU usage, processingqueues, or both. However, as a new server instance can take anywherefrom minutes to an hour to come online, the scheduling process describedabove allows the servers to ensure that high priority work continueswhile lower priority jobs get delayed.
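One way the session assignment and scale-up decision could be expressed is sketched below. The selection rule (lowest CPU usage, ties broken by shortest queue), the threshold values, and all names are assumptions for illustration; the embodiments only state that CPU usage and processing queues may be considered.

```python
from dataclasses import dataclass

@dataclass
class ServerStatus:
    name: str
    cpu_usage: float    # fraction of CPU in use, 0.0 to 1.0
    queue_length: int   # number of queued processing requests

def pick_server(servers):
    """Assign a new session to the least loaded server (assumed rule)."""
    return min(servers, key=lambda s: (s.cpu_usage, s.queue_length))

def needs_new_instance(servers, cpu_threshold=0.8, queue_threshold=10):
    """Request another instance when overall activity exceeds a threshold."""
    avg_cpu = sum(s.cpu_usage for s in servers) / len(servers)
    total_queued = sum(s.queue_length for s in servers)
    return avg_cpu > cpu_threshold or total_queued > queue_threshold

servers = [
    ServerStatus("render-a", cpu_usage=0.9, queue_length=5),
    ServerStatus("render-b", cpu_usage=0.4, queue_length=2),
    ServerStatus("render-c", cpu_usage=0.4, queue_length=7),
]
chosen = pick_server(servers)
scale_up = needs_new_instance(servers)
print(chosen.name, scale_up)
```

Because a new instance may take a long time to come online, a scale-up request like this would work alongside the priority scheduling rather than replace it.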

FIG. 7 illustrates a flow chart 703 describing the operation of the systems 100 according to another embodiment. Flow chart 705 indicates the status of the media being processed at corresponding steps of flow chart 703. At step 707 the application module 15 presents the user with the choice of selecting a previously recorded media element to personalize or recording a new media element. Either selection results in the identification of a media element to be processed. As shown at 708, at this point the media is at the full resolution stored on the user device.

Continuing at step 713, the application module trims and transcodes themedia for upload. This is an optional step that can be performed basedon the device that hosts the application module. For example, in oneembodiment, if the device does not have the ability to open a trimmernatively, then the application module forces the duration of the clip toa reasonable limit. In another embodiment, the application module waitsto create a trimmer at a step described below. As shown at 714, at thispoint the media is at its final resolution but is highly compressed.

Continuing at step 719, the application module uploads the media to theserver. As discussed above, if the media format puts header informationat the end, the application module transmits the last header informationof the media file to the server first. The server receives the endheader of the video and writes it to the end of the file, so the file isnow full length, but with empty data for the entire file except for thevery end. The server then reads the end block to access the format datawhich includes the total media length and any other necessaryinformation. After resolving the header issue, the application modulestarts transmitting the video, from the start. This uploading processcan continue in the background while other steps are performed. Theserver starts receiving the video, storing it in the file sequentially,up to the end block.
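The upload order described above, with the end header sent first and the body filled in afterward, can be sketched as follows. The byte string, the `|HDR` marker, the block size, and the function names are all assumptions made for the example; real container formats locate the header differently.

```python
def upload_order(data, header_offset, block_size=4):
    """Client side: yield (file_offset, chunk) pairs, end header first."""
    yield header_offset, data[header_offset:]
    # Then send the media from the start, sequentially, up to the end block.
    for pos in range(0, header_offset, block_size):
        yield pos, data[pos:min(pos + block_size, header_offset)]

def receive(chunks):
    """Server side: place the end header, then fill the file as blocks arrive."""
    chunks = iter(chunks)
    offset, header = next(chunks)
    # Writing the header at its final offset makes the file full length,
    # with empty (zero) data everywhere before the header.
    stored = bytearray(offset + len(header))
    stored[offset:] = header
    for offset, chunk in chunks:
        stored[offset:offset + len(chunk)] = chunk
    return bytes(stored)

# Illustrative "file": media data followed by a header at the tail.
media_file = b"FRAMEDATA-FRAMEDATA|HDR"
header_offset = media_file.index(b"|")

sent = list(upload_order(media_file, header_offset))
rebuilt = receive(sent)
print(sent[0], rebuilt == media_file)
```

In this sketch the server knows the total file length and can read the format data as soon as the first chunk arrives, before any of the media body has been received.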

Continuing at step 725, the server transcodes the media to multiple formats. In one embodiment, the server immediately starts a transcode process while the file is still being uploaded. In one embodiment, the transcode process runs as a medium priority thread and reads the source file once but generates one or more output data files simultaneously. In one embodiment, the type of output files generated is determined based on input from the application module that requests particular kinds of files. As the source file comes in, the transformer module reads one frame at a time and passes each frame to one or more writers, each of which uses the image data to generate an output file. Output file types include an intermediary preview transcode 726, a motion analysis file 728, and thumbnail files 732.
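The single-read, multiple-writer arrangement can be sketched as below. Frames are modeled as short lists of numbers, and the three writer classes are simplified stand-ins assumed for illustration: the preview writer keeps every other frame and every other sample, the motion writer records a change in mean brightness rather than real motion vectors, and the thumbnail writer records frame indices at a fixed interval.

```python
class PreviewWriter:
    """Stand-in for the low resolution, low frame rate preview transcode."""
    def __init__(self):
        self.out = []
    def write(self, index, frame):
        if index % 2 == 0:                 # keep half the frames
            self.out.append(frame[::2])    # keep half the samples

class MotionWriter:
    """Stand-in for motion analysis: one value per frame."""
    def __init__(self):
        self.out = []
        self._prev_mean = None
    def write(self, index, frame):
        mean = sum(frame) / len(frame)
        self.out.append(0.0 if self._prev_mean is None else mean - self._prev_mean)
        self._prev_mean = mean

class ThumbnailWriter:
    """Records which frames would be written out as thumbnails."""
    def __init__(self, interval):
        self.interval = interval
        self.out = []
    def write(self, index, frame):
        if index % self.interval == 0:
            self.out.append(index)

def transcode(frames, writers):
    """Read each source frame once and pass it to every writer."""
    for index, frame in enumerate(frames):
        for writer in writers:
            writer.write(index, frame)

reads = []
def source():
    for frame in ([0, 0, 0, 0], [2, 2, 2, 2], [4, 4, 4, 4], [4, 4, 4, 4]):
        reads.append(1)   # count how many times the source is read
        yield frame

preview, motion, thumbs = PreviewWriter(), MotionWriter(), ThumbnailWriter(interval=3)
transcode(source(), [preview, motion, thumbs])
print(len(reads), preview.out, motion.out, thumbs.out)
```

Because each writer appends to its output as frames arrive, every output can be read up to the current upload point while the source is still coming in.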

The intermediary preview transcode is a low resolution/low frame rateversion of the original file. This temporary file is used to createpreviews quickly. It is optimized to be streamed from the server fromhard disk with low CPU usage. In one embodiment, the output transcode isimplemented by the transformer module as a video file writer thatconverts the image to a lower resolution and writes it to the outputstream. This video format can be exactly the frame rate and resolutionof the previews that it will be used to generate. However, it can be lowcompression since it is on a local drive so bandwidth is not an issue.

The motion analysis file stores the motion vectors frame by frame. Tocreate this file, the transformer module compares successive frames,looking for motion, rotation, and scale changes. It also looks forrolling shutter distortion. The output file is simply a set of motionvectors, one for each frame.

Thumbnail files are a series of jpeg, or other format, files that are written out at intervals determined by the application module. These thumbnails can then be streamed back down to the application module to be used in a trimmer. Note that this is primarily used for devices that require the thumbnails, such as devices that use Flash. An iPhone, for example, may not use these thumbnails. In one embodiment, if the application omitted a trimmer previously, it starts reading the thumbnail files as they become available and displays them for use in selecting trim points.

Continuing at step 731, the server creates preview renders and downloadsthe previews to the application module. After beginning to upload themedia the application module is immediately able to start using thetranscoded files to perform different operations in real time. Each ofthese files can be read from start to current upload point, so theapplication doesn't need to wait for a full upload before the user canstart making choices and previewing them. For example, the applicationcan begin to use the downloaded previews to implement a trimmer. In oneembodiment, the trimmer provided by the application module downloadsthumbnails dynamically and draws them in a strip. Although thethumbnails progressively fill over time, the operation of setting thepoints can still proceed. This just sets start and end points which willbe used in the final render. The application module can also use thepreview to implement an effects preview. The server uses the transcodedfile to generate a preview to view the effect in real time as applied tothe clip. The clip is downloaded to the application and shown to theuser. Similarly, the server can generate video tag previews that takethe user's choices for photo, text, and style. The preview can bedownloaded and shown to the user via the application module. Thisgenerates a preview to view what it might look like. The server can alsogenerate a final project preview. This is used to let the user see whatthe entire clip looks like, but in lower resolution.

In more detail, an effects preview can be generated by the server. The server uses the transcoded file to generate a preview to view the effect in real time as applied to the clip. In particular, a user clicks on the play button on the user interface of the application module. The application sends the instruction to the server to create a new render using the preview transcode as input with the selected effect applied. In one embodiment, the instruction is used to generate an XML Production file by the server. In some embodiments, response time for previews is important for user satisfaction; accordingly, the effects preview generation is assigned a high priority by the server. In some embodiments, because the transcoded input and the output render are both low resolution, the processing engine is able to create the file in real time. If stabilization is required, the transformer module also applies this in the render pass. The transformer module uses the analysis data from the first pass to calculate how to move the image to compensate for jitter.
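
As an illustration only, the following Python sketch shows one way an XML Production file could be derived from a template by applying the selected effect and replacing default option values with user-supplied values. The element names, attribute names, and the `build_production_file` function are assumptions made for this example; the description does not specify the schema of the XML Production file.

```python
import xml.etree.ElementTree as ET

# Hypothetical template. Element and attribute names are illustrative
# assumptions, not a schema defined by the description.
TEMPLATE_XML = """
<production resolution="preview">
  <input src="transcode_preview.mp4"/>
  <effect name="none"/>
  <option name="title" value="My Video"/>
  <option name="stabilize" value="false"/>
</production>
"""


def build_production_file(template_xml, effect, user_values):
    """Return production XML with the effect set and defaults overridden."""
    root = ET.fromstring(template_xml.strip())

    # Apply the effect the user selected in the application.
    root.find("effect").set("name", effect)

    # Replace the default value of each option the user supplied a value for;
    # options the user did not touch keep their template defaults.
    for option in root.iter("option"):
        name = option.get("name")
        if name in user_values:
            option.set("value", user_values[name])

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_production_file(TEMPLATE_XML, "sepia", {"title": "Beach Day"}))
```

In this sketch the same template can serve both the preview and the final render, with only the `resolution` attribute differing between the two.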

After this processing begins, the server starts to stream the preview to the application module immediately. Because of the low resolution of the input and output, the server is able to stream in real time. The application module starts playing the streamed media immediately. To the user, the behavior is identical to clicking on a video player showing a previously rendered video. There is some latency from each of the stages, but it can be remarkably close to the time required to start a static file playing over the internet. However, it can only play up to the current upload point, then it stops.

The server may also generate a Video Tag preview. In this preview, the server takes the user's choices for photo, text, and style and generates a preview to view what it might look like. As discussed above, the User enters a name for the Tag using the application. The application uploads the text to the server. The User also chooses a picture for the media using the application. The application uploads an image, e.g., a JPEG file, to the server. The User chooses a Video Tag style to use via the application and the user clicks on the Play button on the application's user interface. The application sends the instructions to the server to create a new render using the Video Tag project file with the User's Name and Image. In some embodiments, this render also incorporates the start of the uploaded video, so the transition can be demonstrated. In some embodiments, the application requests the render in sufficiently low resolution for real time response. At the server, the render starts as a high priority process and the server immediately starts streaming to the application for playback.

The server can also generate a final project preview. In some embodiments, this occurs when a user clicks on the play button in the application user interface. In response to an indication of the selection by the user from the application, the server generates the final production file but at the preview, i.e., lower, resolution. The server starts writing the file as a high priority process. Once processing begins, the server starts to stream the file back to the application. Once the application begins to receive the stream it begins playback.

Continuing at step 737, the server generates the final render. Via the application user interface, a user chooses to publish the video clip. The server responds to an indication from the application of this selection by generating the final render in high resolution that will be published to the destination (Facebook, YouTube, etc.). The final render is queued by the server as a low priority process. In addition, if there are other renders in the queue ahead of it, the final render waits. After requesting the final render, the application does not need to provide any additional input and can leave the session. Eventually, the render request makes it to the head of the queue and is assigned to a processor which starts the job as a low priority process so that it will only run when higher priority requests are not actively processing. When the render is finished, the server sends the video file to the destination (YouTube, Facebook, etc.) and then notifies the user that the video has been published.
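
The scheduling behavior described above, in which preview renders run as high priority processes and final renders wait as low priority processes, can be sketched with a priority queue. The following Python example is a minimal illustration under stated assumptions: the `RenderQueue` class, the priority constants, and the job identifiers are hypothetical and are not part of the described system.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
PREVIEW_PRIORITY = 0  # effects, Video Tag, and final project previews
FINAL_PRIORITY = 1    # high resolution final renders


class RenderQueue:
    """Serve high priority previews before queued final renders."""

    def __init__(self):
        self._heap = []
        # A running counter keeps first-in, first-out order within a priority.
        self._sequence = itertools.count()

    def submit(self, job_id, priority):
        heapq.heappush(self._heap, (priority, next(self._sequence), job_id))

    def next_job(self):
        """Return the next job to assign to a processor, or None if idle."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    queue = RenderQueue()
    queue.submit("final-1", FINAL_PRIORITY)
    queue.submit("preview-1", PREVIEW_PRIORITY)
    while (job := queue.next_job()) is not None:
        print(job)
```

A final render submitted before a preview is still served after it, which matches the description that the final render waits while higher priority requests are processed.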

It should be appreciated that the transformer module and/or processor module can reside on a cloud server, a local computer or other device. All peripheral devices including a monitor or display and input/output devices can be used by the user to perform such editing as needed. Additionally, in embodiments where the transformer module and/or processor module reside on a cloud server, communication links such as wireless or wired connections (e.g., a network connection) are provided so that the user can access the transformer module and/or processor module from the cloud server from his local computer or other device (e.g., via application module).

In fact, in certain embodiments, the cloud server can make decisions about what materials to make available to a user, including: Intros, e.g., the personality, or “tag” templates; effects; destinations for posting; and video clip to use in the intro or outro, so for example a partner promotion can be substituted. This information can dynamically be collected, based on different inputs. For example, in certain implementations, the user can inform the server with the GPS location. This can be done automatically. In other words, the user's device can have a GPS circuit included in it, or can get GPS assisted coordinates from the network. This information can then be sent to the cloud server and can influence what video tag selections are made available to the user. For example, the cloud server can then determine that the user is at a specific location like a theme park, convention center, movie theatre, etc., and provide, e.g., intros and titles based on where the user is located.
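
One possible way for the cloud server to select video tags based on a reported GPS location is to compare the user's coordinates against a list of known venues. The Python sketch below is illustrative only: the venue table, coordinates, radius, tag names, and function names are all assumptions, and the great-circle (haversine) distance test is one of several approaches the server could take.

```python
import math

# Hypothetical tags available to every user.
DEFAULT_TAGS = ["basic_intro", "basic_title"]

# Hypothetical venue table: (name, latitude, longitude, radius in km, tags).
VENUES = [
    ("example theme park", 10.0, 20.0, 1.0, ["theme_park_intro"]),
    ("example convention center", 30.0, 40.0, 0.5, ["convention_title"]),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def available_tags(location=None):
    """Return default tags plus tags for any venue containing the location."""
    tags = list(DEFAULT_TAGS)
    if location is None:  # the device did not report a position
        return tags
    lat, lon = location
    for _name, v_lat, v_lon, radius_km, venue_tags in VENUES:
        if haversine_km(lat, lon, v_lat, v_lon) <= radius_km:
            tags.extend(venue_tags)
    return tags


if __name__ == "__main__":
    print(available_tags((10.001, 20.001)))
```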

There can also be a mechanism on the client device for identifying special promotions with a 3rd party, for example, typing in a special code, scanning a bar code or Q-code, etc. The cloud server can also check the time, location, or both and determine whether there is a promotion that is at a specified time, location, or both. Further, information in the database that was collected elsewhere can be used to determine special promotions or other information. For example, a user may have been signed up through a promotion with a partner, which can be flagged on the back end.

The cloud server can also be configured to track certain information such as which intros and effects the client uses each time the client creates a video. With this information, the cloud server can track usage statistics and correlate with other user demographics, etc. This can be used to constantly update promotions, titles, videos, and other effects. This can also be used to determine which types of effects and intros to create next, which ones to recommend, which ones to charge a premium for, etc. This information can also be used to report to promotion partners, potentially for revenue generation, e.g., invoice for the number of intros used.
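
The usage tracking described above can be sketched with simple counters. In the following Python example, the `UsageTracker` class, its method names, and the per-intro price are hypothetical and serve only to illustrate how per-video records could feed recommendations and partner invoicing.

```python
from collections import Counter


class UsageTracker:
    """Count which intros and effects are used each time a video is created."""

    def __init__(self):
        self.intro_uses = Counter()
        self.effect_uses = Counter()
        # Intros used under a partner promotion, keyed by partner name.
        self.partner_intro_uses = Counter()

    def record_video(self, intro, effect, partner=None):
        self.intro_uses[intro] += 1
        self.effect_uses[effect] += 1
        if partner is not None:
            self.partner_intro_uses[partner] += 1

    def most_used_intros(self, n=3):
        """Intros ranked by use, e.g. as candidates to recommend."""
        return [name for name, _count in self.intro_uses.most_common(n)]

    def partner_invoice(self, partner, price_per_intro):
        """Amount to invoice a partner for the number of intros used."""
        return self.partner_intro_uses[partner] * price_per_intro


if __name__ == "__main__":
    tracker = UsageTracker()
    tracker.record_video("birthday", "sepia", partner="partner_a")
    tracker.record_video("birthday", "none", partner="partner_a")
    tracker.record_video("holiday", "sepia")
    print(tracker.most_used_intros(1), tracker.partner_invoice("partner_a", 0.5))
```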

Also, because the server manages what each user has available, it is easy to integrate mechanisms for monetizing via the selling of Effects, Intros, and other items. For example, many effects and intros can be offered as “free”. New effects and intros can dynamically show up in the client device, labeled “premium”. When the user chooses a premium effect or intro, it can be previewed, but must be purchased, e.g., through an in-app purchasing mechanism, in order to use. Once purchased, the database records that the user has the rights to this material. This right moves with the user to all devices in the account. The user can have the option to purchase a subscription, which enables use of all premium content. This simply sets a flag in the user account, allowing use of all materials.
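
The entitlement rules described above (free items usable by anyone, premium items previewable but usable only after purchase, and a subscription flag that unlocks all premium content) can be sketched as follows. The `Account` type, the `PREMIUM_ITEMS` set, and the function names are assumptions made for this Python example.

```python
from dataclasses import dataclass, field

# Hypothetical catalog of items labeled "premium"; all other items are free.
PREMIUM_ITEMS = {"cinematic_intro", "film_grain_effect"}


@dataclass
class Account:
    """Per-account record, shared by all devices in the account."""
    subscriber: bool = False                    # subscription flag
    purchased: set = field(default_factory=set)  # individually bought items


def can_preview(account, item_id):
    # Any effect or intro, free or premium, can be previewed.
    return True


def can_use(account, item_id):
    """Return True if the item may be used in a published video."""
    if item_id not in PREMIUM_ITEMS:
        return True
    return account.subscriber or item_id in account.purchased


def record_purchase(account, item_id):
    # Once purchased, the account retains the right to this material.
    account.purchased.add(item_id)


if __name__ == "__main__":
    account = Account()
    print(can_use(account, "cinematic_intro"))
    record_purchase(account, "cinematic_intro")
    print(can_use(account, "cinematic_intro"))
```

Because the rights are stored on the account rather than on a device, the same check gives the same answer on every device signed in to that account.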

Those of skill will appreciate that the various illustrative logical blocks, modules, and algorithm steps described in connection with the embodiments disclosed herein can often be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the design constraints imposed on the overall system. Skilled persons can implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the invention. In addition, the grouping of functions within a module, block or step is for ease of description. Specific functions or steps can be moved from one module or block without departing from the invention.

The various illustrative logical blocks and modules described in connection with the embodiments disclosed herein can be implemented or performed with a general purpose processor, a digital signal processor (DSP), application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor can be a microprocessor, but in the alternative, the processor can be any processor, controller, microcontroller, or state machine. A processor can also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

The steps of a method or algorithm described in connection with the embodiments disclosed herein can be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of machine or computer readable storage medium. An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium can be integral to the processor. The processor and the storage medium can reside in an ASIC.

The above description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles described herein can be applied to other embodiments without departing from the spirit or scope of the invention. For example, while the present invention has been described as encompassing a method and tool or system for personalizing media, it should be understood that the tool can be implemented as electronic hardware, computer software, or combinations of both. Thus, it is to be understood that the description and drawings presented herein represent a presently preferred embodiment of the invention and are therefore representative of the subject matter which is broadly contemplated by the present invention.

In accordance with an implementation, a transformer module is implemented on a web server. The transformer module is used to apply a Video Tag to media to generate personalized output. The transformer module manages a database of users and Video Tags. A remote application module invokes the transformer module to process a video clip. The remote application module sends the transformer module the video clip (or other media), with instructions for processing (Video Tag choice), and desired destination.

What is claimed:
 1. A system for generating edited video, the system comprising: a non-transitory computer readable medium configured to store computer instructions; and a processor communicatively coupled with the non-transitory computer readable medium and configured to execute the computer instructions stored therein to: receive, from a user device, an unedited video, a selection of a first template including a first option for a first feature to be overlaid on at least a portion of an unedited video, and a user-input value for the first option; generate a production file from the first template including by replacing a default value for the first option with the user-input value for the first option; generate, based at least in part on the production file, an edited video having the first feature overlaid on at least a portion of the unedited video; and transmit a preview of the edited video to the user device in real-time while generating at least a portion of the edited video.
 2. The system of claim 1, wherein the first further includes a second option for a first effect to apply to the unedited video.
 3. The system of claim 1, wherein the first templates further includes a third option for a first embellishment to add to the unedited video.
 4. The system of claim 3, wherein the first embellishment comprises one of a style, a sound effect, a transition, and a music track.
 5. The system of claim 1, wherein the production file includes instructions for the processor to generate the edited video.
 6. The system of claim 1, wherein the first feature comprises one or an image, a video track, and a title frame.
 7. The system of claim 1, wherein the first template comprises one of a default template and a user customized template associated with a user account.
 8. The system of claim 7, wherein the user account is associated with at least one records including login credentials for a video sharing website.
 9. The system of claim 1, wherein the preview video has a lower resolution than the edited video.
 10. The system of claim 9, wherein the processor is configured to transmit the preview video to the user device while receiving at least a portion of the unedited video from the user device.
 11. The system of claim 1, wherein the first option comprises one or more parameters defining at least one aspect of the first feature.
 12. The system of claim 11, wherein the at least one aspect of the first feature comprises a duration, an opacity, a background, and texts associated with the first feature.
 13. A system for generating edited video, the system comprising: a remote server, comprising: a non-transitory computer readable medium configured to store computer instructions; and a processor communicatively coupled with the non-transitory computer readable medium and configured to execute the computer instructions stored therein to: receive an unedited video, selection of a first template including a first option for a first feature, and a user-input value for the first option; generate a production file from the first template including by replacing a default value for the first option with the user-input value for the first option; generate, based at least in part on the production file, an edited video having the first feature overlaid on at least a portion of the unedited video; and transmit a preview of the edited video in real time while generating at least a first portion of the edited video; and a user device comprising: a non-transitory computer readable medium configured to store computer instructions; and a processor communicatively coupled with the non-transitory computer readable medium and configured to execute the computer instruction stored thereon to: receive, from a user, a selection of the first of a plurality of templates, wherein the first template includes the first option for the first feature to be overlaid over at least a portion of the unedited video; receive, from the user, the user-input value for the first option; transmit, to the remote server, the unedited video, the selection of the first template, and the user-input value for the first option; and receive, from the remote server, a preview of an edited video having the first feature overlaid on at least a portion of the unedited video.
 14. The system of claim 13, wherein the first template further includes a second option for a first effect to apply to the unedited video.
 15. The system of claim 14, wherein the first further includes a third option for a first embellishment to add to the unedited video.
 16. The system of claim 15, wherein the first embellishments include at least comprises one of a style, a sound effect, a transition, and a music track.
 17. The system of claim 16, wherein the production file includes instructions for the processor to generate the edited video.
 18. The system of claim 13, wherein the first feature comprises one or more of images, a video track, and a title frame.
 19. The system of claim 13, The system of claim 1, wherein the first template comprises of a default template and a user customized template associated with a user account.
 20. The system of claim 19, the user account is associated with at least one records including login credentials for a video sharing website.
 21. The system of claim 13, wherein the preview video has a lower resolution than the edited video.
 22. The system of claim 21, wherein the remote server is configured to transmit the preview video to the user device while receiving at least a portion of the unedited video from the user device.
 23. The system of claim 13, wherein the user device is configured to receive at least a portion of the preview video from the remote server while transmitting at least a portion of the unedited video to the remote server.
 24. The system of claim 13, wherein the first option comprises one or more parameters defining at least one aspect of the first feature.
 25. The system of claim 24, wherein the at least one aspect of the first feature comprises a duration, an opacity, a background, and texts associated with the first feature.